[…] prediction. The protease cleavage pattern discovery problem is a classification problem in machine learning. This chapter introduces several classification analysis or discriminant analysis algorithms and discusses how these algorithms can be used for protease cleavage pattern discovery. Importantly, encoding peptides so that the encoded data are biologically sound is the key to making a protease cleavage pattern discovery task successful. This chapter therefore introduces the bio-basis function, a new and cutting-edge approach for encoding peptides. Based on the bio-basis function, several advanced algorithms are then introduced for protease cleavage pattern discovery.

Biology question — protease cleavage

A protease is an enzyme which interacts with polyproteins for several […] processes in cells [Antalis, et al., 2007]. For instance, a hepatitis […] protease can cleave a polyprotein within an infected cell to release viral proteins. These released viral proteins escape from the infected cell to invade and infect healthy cells, which then produce more viral proteins […n, et al., 2007; Scheel, et al., 2008]. An inhibitor is an effective approach to stop or slow down the viral infection [Lawitz, et al., …; …ng, et al., 2019]. Designing an effective drug against a viral protease requires the study of the amino acid composition trend in the primary structures (peptides) around the cleavage sites of a specific protease [… et al., 2018; Betancur-Ancona, et al., 2018].

A peptide is a sub-sequence of a few to a dozen amino acids. The function of a protease is to cleave a polyprotein at the targeted residues to release functional proteins. The interaction between a protease and a polyprotein depends highly on the specificity of the amino acids in the […] areas of the interaction sites. Such an interaction site between a polyprotein and a protease is called a protease cleavage site. The consequence of such an interaction, or cleavage, is the split of the polyprotein, leading to the release of functional proteins.
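
To make the notions of cleavage site and peptide concrete, the following minimal Python sketch splits a sequence at given sites and extracts the peptide surrounding one site. It is an illustration only: the sequence and site positions are invented, the function names `cleave` and `peptide_window` are hypothetical, and the window of four residues on each side of the site is an assumed size rather than one stated in the text.

```python
def cleave(sequence, sites):
    """Split a polyprotein into fragments at the given cleavage sites.

    Each site is the index of the first residue after the cleaved bond,
    so the cut falls between sequence[site - 1] and sequence[site].
    """
    bounds = [0] + sorted(sites) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def peptide_window(sequence, site, flank=4):
    """Return the peptide of 2 * flank residues centred on a cleavage site.

    `flank` (residues kept on each side) is an assumed, adjustable value.
    """
    if site - flank < 0 or site + flank > len(sequence):
        raise ValueError("window runs past the end of the sequence")
    return sequence[site - flank:site + flank]


# Invented example data, not a real polyprotein or real cleavage sites.
polyprotein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
sites = [10, 22]

fragments = cleave(polyprotein, sites)
peptides = [peptide_window(polyprotein, s) for s in sites]

print(fragments)  # the released fragments
print(peptides)   # one peptide per cleavage site
```

Peptides extracted in this way, each labelled as containing a cleavage site or not, would be the kind of input a classification algorithm works on once the residues have been encoded numerically.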